Ecogroups and maternal haplogroups reveal the ancestral origin of native Chinese goat populations based on the variation of mtDNA D‐loop sequences

Abstract China is rich in goat breeding resources. Officially recognized local goat breeds are mainly distributed in agro‐ecological regions. The population structure and matrilineal origin of native Chinese goats can be used to formulate protection and utilization strategies for these genetic resources. In this study, the genetic structure and maternal origin of native Chinese goats were investigated using mtDNA D‐loop sequences. A total of 329 goat samples from 25 Chinese indigenous goat populations and five introduced goat breeds from abroad were collected; these populations were distributed in four ecogroups designated as Southwest, South‐central, the North China Plain, and Foreign‐ecogroup. A larger average number of nucleotide differences and richer nucleotide diversity were observed in South‐central and Foreign‐ecogroup, whereas these were lower in Southwest. The 216 haplotypes divided into several haplogroups, of which HapA contained 99 haplotypes distributed in Southwest, the North China Plain, and Foreign‐ecogroup with high frequency (0.53–0.77), whereas the frequency of HapA in South‐central was <0.09. HapB was mostly found in South‐central (0.5538) and was distributed to the North China Plain (0.2667), while it was rare in Southwest (<0.08) and Foreign‐ecogroup (<0.07). According to the estimation of kinship and ancestry, HapA had five ancestors (A2, A3, A5, A10, and A12), HapB had a single maternal ancestor (A8), and HapC had two maternal ancestors (A1 and A4). This study showed that native Chinese goat breeds were mainly divided into three haplogroups (HapA, HapB, and HapC) and goat populations have expanded in the ecological regions.


| INTRODUC TI ON
China has rich goat resources, including officially recognized local and cosmopolitan breeds, and has a long breeding history (Tu et al., 1989). A study suggested that the common ancestor of Chinese goats originated in western Iran, and modern Chinese goat breeds have differentiated into two large groups that adapt to dry cold environments in the north and humid hot environments in the south (Cai et al., 2020). In addition, some ancient indigenous goat populations were identified in the third livestock and poultry genetic resources survey in China. For example, Qianqiu goats (QQ, Figure 1), which originated in Tianchang City, China, have been reared for more than 1270 years. During this time period, there was a legal system that governed growing grass and rearing sheep, and people paid taxes with sheep skins (Xia et al., 1992). However, the evolutionary history and migration routes of these Chinese goat breeds remain poorly understood. In China, goat breeding for mutton, milk, sheepskin, and cashmere has a long tradition as well as economic importance for providing Chinese people with winter food and warm clothing. China is comprised of complex terrain and varying climate with frigid, temperate, and torrid zones, and nearly 50 Chinese indigenous goat breeds (Tu et al., 1989). According to the list of Chinese breed resources, there are more than 70 goat breeds in China (National Animal Husbandry Station (China), 2022). The supply of goat meat has increased in China with consumer demand.
In the 1970s, China introduced foreign goat breeds to improve the carcass and lactation performance of local goats, which enriched the genetic basis of native goats in China but also brought severe challenges to the protection of indigenous Chinese goat breeds Jin et al., 2012). Therefore, it is important to identify and protect the genetic resources of goats in China. The limited information available on the characteristics of indigenous Chinese goats is mostly based on phenotypic features, which can be subjective and dependent on the environment, making it difficult to distinguish between populations (Falconer & Mackay, 1996).
The majority of native goat breeds in China are distributed in the southwest plateau (SWG), south-central mountains and hills (SCG), and North China Plain to Yangtze River (NYR) regions ( Figure 2).
The barrier effects of rivers and mountains have led to the formation of many ecological groups (ecogroups) in Chinese goat populations . However, the real origins of native Chinese goats remain unexplored and enigmatic (Wang et al., 2017;Wei et al., 2014). Little is known about how and when local goats inhabited their domestic regions, the phylogenetic relationships within and among populations, and their migration routes associated with important social and cultural characteristics.
Polymorphisms of mitochondrial DNA (mtDNA) are widely used to study the origin, maternal lineages, and domestication processes, and to clarify the migration routes of goat populations (Al-Araimi et al., 2017), which were reflected in the complete mitogenomics (Doro et al., 2014;Piras et al., 2012;Zhang et al., 2016).
Additionally, this will help to construct the populations with nuclear families or rare haplotypes for specific exploitable aims in the maternal genetic characteristics (Deniskova et al., 2020), and to elucidate the impact of introduced goats on local goat genetic resources, providing a basis for the protection of indigenous goat breeds in China. In this study, we aimed to assess the genetic diversities and to determine the maternal origins and population history of indigenous Chinese goat populations by analyzing D-loop mtDNA sequences.

| Sample collection
A total of 329 samples were collected from 30 goat breeds, of which this included 299 unrelated goats from 25 indigenous goat populations in major ecoregions and 30 goats from five introduced foreign goat populations. All goat populations were divided into four ecogroups: SWG, SCG, NYR, and IBG (Table 1). To establish the genetic relationships of the studied goat populations with goats from different geographical locations, we included an additional 300 mtDNA D-loop sequences from the NCBI database (Table S1).   300 samples from the other 29 populations were from officially recognized goat breeds, and the mtDNA D-loop sequence information was seen in Table S1.

| PCR amplification, purification, and sequencing
Genomic DNA was extracted from whole blood using commercial DNA-Extran-2 kits (CJSC Syntol), according to the manufacturer's in- for 90 s; and a final extension at 72°C for 10 min. The quality of PCR products was evaluated using 1% agarose gel electrophoresis with ethidium bromide (0.5 μg/mL).
E-Gel CloneWell II agarose gels (Thermo Fisher) were used to purify the PCR products. Purified PCR products were submitted to General Biology (Anhui) for Sanger sequencing from both ends.

| Sequence alignment and data analysis
The nucleotide sequences were edited and aligned using DNAStar EditSeq and MEGA version 11 (Tamura et al., 2021) software. DnaSP version 6 was used to calculate the haplotype number, haplotype diversity, nucleotide diversity, variable site number, and average number of nucleotide differences (Rozas et al., 2017). To determine the source of genetic variation among populations, AMOVA, mismatch distribution, and neutrality tests were performed using Arlequin version 3.5 (Excoffier & Lischer, 2010).
BIONJ was used to construct the phylogenetic tree using all generated sequences with 1000 bootstrap iterations (Gascuel, 1997).
Haplotypes were determined using DnaSP version 6 software (Librado & Rozas, 2009), and median-joining networks for various haplotypes were constructed using PopART version 4 software (French et al., 2014). The degree of differentiation and population structure of the investigated breeds were assessed using ADMIXTURE version 1.3 software (Alexander et al., 2009). A good value of K exhibits a low cross-validation error compared with other K values (K = 2-15).
The MER per site of the mtDNA D-loop sequences was estimated using the Tamura and Nei (1993)  The nucleotide substitution frequencies in the D-loop sequences were analyzed to identify, high-frequency site regions (HFSR). From Figure 3, the results showed that these sites were clustered in 300-600 bp and 1000-1200 bp sequences, with a higher mean relative evolutionary rate (MER) for each site. The MERs of the HFSR evolved faster than average, and those of the SCG evolved faster than other ecogoups.

F I G U R E 3
The mean (relative) evolutionary rate per sites of mtDNA D-loop of 329 goat individuals. These rates are scaled such that the average evolutionary rate across all sites is 1. This means that sites showing a rate <1 are evolving slower than average, and those with a rate >1 are evolving faster than average. These relative rate were estimated under the Tamura and Nei (1993) model (+G). NYR, SCG, SWG, and IBG were the ecogroups defined in Figure 2.

| Maternal haplogroup
Phylogenetic analysis was conducted to assess the relationship between the ecogroups and populations and their respective maternal origins. The 329 sequences were used to construct the neighborjoining (BIONJ) tree and median-joining (MJ) network ( Figure 4). Both the tree in Figure 4a and MJ network in Figure 4b showed that all goats were classified into distinct groups, which represented haplogroups A (HapA), B, and C, containing 99, 52, and 37 haplotypes, respectively.
The haplotype profile was constructed using the haplogroup distributions of the different ecogroup populations in Figure 5.

| Population structure and matriarchal ancestries
The body-color and habitats of goats were compared by AMOVA (  (p < .01) and within breeds was 77.54% (p < .01). These results suggest that diversity is related to the geographical distribution of the goat populations.
The matriarchal ancestries of the goat breeds were estimated using the ADMIXTURE program, with K expected clusters ranging from 2 to 15. The optimal number of ancestral populations with the lowest CV error was K = 13 (Figure 6a). At K = 6 or 7, the distinct classification of goat breeds only remained at the ecogroup level. At K = 13, the maternal ancestors among the breeds became clear (Figure 6c). The ratio (RDA) of goats with more than 90% ancestral components (dominant ancestors) was calculated in Figure 6b. The top four dominant ancestors were A8 for SCG (46.92%), A2 for IBG (33.33%), A3 for NYR (26.67%), and A5 for SWG (20.18%), respectively. This was followed by A8 for NYR (18.33%), A1 for SCG (16.15%) and SWG (13.76%), A4 for SWG (14.68%), and A10 for SWG (10.09%) and NYR (11.67%). These

F I G U R E 6
The Population structure of 30 goat breeds. (a) estimating CV error with K ancestry clusters; (b) % of goats with more than 90% of ancestral genetic components; (c) Estimated population structure of the analyzed breeds represented as bar plot depicting individual membership proportions for each cluster. The codes of breeds and ecogroups were showed in Table 1 and Figure 2, respectively.

F I G U R E 7
The maternal ancestors of three Haplogroups (b) and the mean relative evolutionary rate of each sites in the ancestor mtDNA D-loop (a). SWG, SCG, NYR, and IBG were ecogroups and defined in Figure 2. A2, A3, A5, A10 and A12 were the ancestors of HapA, A8 was the ancestor of HapB, and A1 and A4 were the ancestors of HapC (b).

F I G U R E 8 Mismatch distribution graphs for the 30 goat populations in China and Haplogroups A, B, and C analyzed in this study. The
x-axis shows the number of pairwise differences, and y-axis shows the frequency of the pairwise comparisons.
results indicate that Chinese native goats in different ecoregions originated from two to three matrilineal ancestors and had no homology with imported goats.
HapA was associated with five ancestors (A2, A3, A5, A10, and A12), HapB was associated with a single maternal ancestor (A8), and HapC was associated with two ancestors (A1 and A4; Figure 7b). the MER of each site in the mtDNA D-loop sequence of different maternal ancestors was estimated ( Figure 7a) and found that the MER of HFSR of ancestors A2 and A10 were higher than those of A1, A5, and A4 and that the MER of HFSR of A8 was between these two groups of ancestors.

| Population demographic history based on the ecogroups and haplogroups
Past population expansion events were inferred based on the patterns of mismatch distributions ( Figure 8) and neutrality test estimates ( Table 4). The mismatch distributions showed ragged and bimodal patterns for each ecogroup and haplogroup, and the observed pattern did not deviate significantly from that expected under a null hypothesis model of either spatial or demographic expansion.
Tajima's D values for the goat ecogroups were not significantly different from zero for all populations, except for NYR (p < .05). Fu's F s estimates for the goat ecogroups were negative and statistically sig-

| DISCUSS ION
This study is an assessment of the genetic diversity, population structure, haplotype profile, and matriarchal ancestry of goat populations mainly from agro-ecological zones in China where indigenous goats are reared, and provides insight into the genetic history of these goat populations based on mtDNA D-loop sequences.
Previous studies that investigated genetic diversity using AFLP  and microsatellite (E et al., 2019) markers and maternal origin (Li et al., 2021;Liu et al., 2006) only considered a few populations representing a few agro-ecological zones. In this study, we Chinese goats than previous studies (Luikart et al., 2001;Zhong et al., 2013). However, there is insufficient evidence that the genetic diversity of black goats is higher than that of white goats (Zhong et al., 2013).
Using the available goat mtDNA haplogroup classification system (Naderi et al., 2007) oldest population; HapA's wide distribution was consistent with the world (IBG, 0.7667) scenario described in previous studies (Naderi et al., 2007) in all goat breeds in the world. Interestingly, the frequency of HapA in the SCG (<0.09) was very low, and the expected haplotype flow was not observed. In contrast, HapB was mostly found in SCG (0.5538) and was distributed to NYR (0.2667) which is 1000 km away, whereas it was rarely distributed in SWG (<0.08), which is adjacent to SCG, and IBG (<0.07). These results indicate that HapB may belong to the original haplotype group of native Chinese goats and that the effect of human migration or commercial trade (Fernández et al., 2006;Nguluma et al., 2021) on haplotype distribution has been weak.
Analysis of animal kinship and ancestry and estimation of population purity and protection have been performed on cattle and sheep populations (Ben Sassi-Zaidy et al., 2022;Zheng et al., 2020). According to the assessment of the investigated population in Figure 6, the higher RDA of A8 in SCG may be evidence for TapB as the maternal ancestral haplotype; However, its distribution in IBG is 0 and its proportion in SWG (4.59%) is low. From the population demographic parameters estimated from the analysis of the complete mtDNA D-loop, Fu's F s values of the HapB and HapC haplogroups were negative but not statistically significant, indicating no population expansion.

| CON CLUS IONS
Native Chinese goat breeds were mainly divided into three haplogroups: HapA, HapB, and HapC. HapA, HapB, and HapC were associated with five, one, and two maternal ancestors, respectively. The goat populations expanded in the different ecoregions, indicating that native Chinese goats originated from two to three matrilineal ancestors and had no homology with imported goats.

ACK N OWLED G M ENTS
This study was supported by the Qianqiu Goat Protection Project of Chuzhou, China (TC202244).

CO N FLI C T O F I NTER E S T S TATEM ENT
No potential conflict of interest was reported by the authors.

DATA AVA I L A B I L I T Y S TAT E M E N T
All sequences can be seen on GenBank, with accessions provided in Table S1.